A Provenance Model for Manually Curated Data
Abstract
Many curated databases are constructed by scientists integrating various existing data sources “by hand”, that is, by manually entering or copying data from other sources. Capturing provenance in such an environment is a challenging problem, requiring a good model of the process of curation. Existing models of provenance focus on queries/views in databases or computations on the Grid, not updates of databases or Web sites. In this paper we motivate and present a simple model of provenance for manually curated databases and discuss ongoing and fu-
Related resources
Provenance in Manually Curated Databases
Many curated databases are constructed by scientists integrating various existing data sources. Most current approaches to provenance in databases are based on views and fail to take account of the added value of the work done by scientists in manually creating and modifying data. Capturing provenance in such an environment is a challenging problem, requiring changes in practice, changes to exi...
A Copy-and-Paste Model for Provenance in Curated Databases
Provenance is information describing the origin, construction, location, ownership, or other aspects of the history of an object. Previous work on provenance has concentrated on an understanding of how provenance is described when the data of interest has been derived by queries from other data sources, as is the case in data warehouses. In this paper we focus on another important class of data...
Improv: Flexible Data Provenance for Relational Databases
Curated databases, which consist of data extracted from original sources, printed articles, and other databases, are a valuable source of data for scientists. However, as curated databases aggregate information from multiple sources, the origin of the data elements can be lost. Because of this, curated databases often provide support for data annotations, which are pieces of extra information a...
Publishing DisGeNET as nanopublications
The increasing and unprecedented publication rate in the biomedical field is a major bottleneck for knowledge discovery in the Life Sciences. The manual curation of facts from published scientific papers is slow and inefficient, and therefore new approaches are needed that can enable the automatic, scalable and reliable extraction of assertions. While the publication of scientific assertions an...
Leveraging the Open Provenance Model as a Multi-tier Model for Global Climate Research
Abstract— Global climate researchers rely upon many forms of sensor data and analytical methods to help profile subtle changes in climate conditions. The U.S. Department of Energy’s Atmospheric Radiation Measurement (ARM) program provides researchers with a collection of curated Value Added Products (VAPs) resulting from continuous sensor data streams, data fusion, and modeling. The ARM operati...
Publication date: 2006